Regret Bounds for Lifelong Learning
Authors
Abstract
We consider the problem of transfer learning in an online setting. Different tasks are presented sequentially and processed by a within-task algorithm. We propose a lifelong learning strategy which refines the underlying data representation used by the within-task algorithm, thereby transferring information from one task to the next. We show that when the within-task algorithm comes with some regret bound, our strategy inherits this good property. Our bounds are in expectation for a general loss function, and uniform for a convex loss. We discuss applications to dictionary learning and finite set of predictors. In the latter case, we improve previous O(1/√m) bounds to O(1/m), where m is the per-task sample size.
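The two-level scheme in the abstract (an inner online algorithm per task, an outer update that refines the shared representation) can be illustrated with a minimal sketch. This is not the paper's actual algorithm: it assumes a linear dictionary `D`, squared loss, and plain gradient steps at both levels, all chosen for illustration.

```python
import numpy as np

def within_task_ogd(D, X, y, lr=0.05):
    """Inner loop: online gradient descent on squared loss, predicting
    through the shared representation z = D @ x. Returns the average
    per-step loss and an accumulated gradient of the loss w.r.t. D."""
    w = np.zeros(D.shape[0])          # task-specific weights
    losses = []
    grad_D = np.zeros_like(D)
    for x, yt in zip(X, y):
        z = D @ x                     # represented features
        err = w @ z - yt
        losses.append(0.5 * err ** 2)
        grad_D += err * np.outer(w, x)  # d(loss)/dD, accumulated
        w -= lr * err * z             # inner OGD step on w
    return float(np.mean(losses)), grad_D

def lifelong(tasks, d, k, meta_lr=0.01, seed=0):
    """Outer loop: after each task, refine the dictionary D so that
    information is transferred to subsequent tasks."""
    rng = np.random.default_rng(seed)
    D = rng.normal(size=(k, d)) / np.sqrt(d)
    avg_losses = []
    for X, y in tasks:
        loss, grad_D = within_task_ogd(D, X, y)
        avg_losses.append(loss)
        D -= meta_lr * grad_D         # meta-step: transfer across tasks
    return avg_losses
```

Here `within_task_ogd` plays the role of the within-task algorithm with a regret guarantee, while `lifelong` is the outer strategy; the paper's analysis concerns how the outer loop inherits the inner regret bound, which this sketch does not attempt to reproduce.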
Similar Papers
Sequential Transfer in Multi-armed Bandit with Finite Set of Models
Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer...
Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret
Lifelong reinforcement learning provides a promising framework for developing versatile agents that can accumulate knowledge over a lifetime of experience and rapidly learn new tasks by building upon prior knowledge. However, current lifelong learning methods exhibit non-vanishing regret as the amount of experience increases, and include limitations that can lead to suboptimal or unsafe control...
Explanation-Based Neural Network Learning: A Lifelong Learning Approach
Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning
We present a learning algorithm for undiscounted reinforcement learning. Our interest lies in bounds for the algorithm’s online performance after some finite number of steps. In the spirit of similar methods already successfully applied for the exploration-exploitation tradeoff in multi-armed bandit problems, we use upper confidence bounds to show that our UCRL algorithm achieves logarithmic on...
Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses
The area under the ROC curve (AUC) is a widely used performance measure in machine learning, and has been widely studied in recent years particularly in the context of bipartite ranking. A dominant theoretical and algorithmic framework for AUC optimization/bipartite ranking has been to reduce the problem to pairwise classification; in particular, it is well known that the AUC regret can be form...